Mutual attention mechanism-driven lightweight semantic segmentation network

Abstract

Objective: In image semantic segmentation, the fusion of detail features and semantic features is a known difficulty. Special-purpose fusion modules designed for particular network architectures lack scalability and universality; self-attention can capture global information but cannot fuse different features, and other attention mechanisms lack interpretability in their mask computation. This paper models the degree of correlation between feature maps and proposes a mutual attention mechanism-driven segmentation module.

Method: The module is derived by reconstructing the computation of the non-local block and changing the mappings that produce the query, key, and value. It takes a detail feature map and a semantic feature map from different stages of the network, builds a correlation model between any point on the detail feature map and the semantic feature map, and, guided by that model, aggregates features from the semantic feature map as a supplement to the corresponding point on the detail feature map. Information from the semantic feature map is thereby fused into the detail feature map; the same operation is then applied in the opposite direction to fuse information from the detail feature map into the semantic feature map, so that feature maps from different stages are mutually fused. Because the inputs and outputs keep the same form, the module can be easily embedded into existing models, and sharing queries and keys between the two attention directions effectively reduces its complexity.

Result: Five semantic segmentation models were selected for the experiments, trained on public datasets including CamVid. When BiSeNet V2 (bilateral segmentation network) was modified by replacing its fusion module, the floating-point operations, memory usage, and number of parameters decreased by 8.6%, 8.5%, and 2.6%, respectively, while the mean intersection over union (mIoU) still improved. When the other four networks were modified by inserting the module, the mIoU of every network improved to varying degrees, with the highest gains reaching 1.14% and 0.74%. The effect of the number of channels of the queries, keys, and values on complexity and on the correlation modeling was also analyzed.

Conclusion: The proposed mutual attention module generally improves the semantic segmentation accuracy of a model and can be used plug-and-play across different network models, showing high universality.
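The abstract describes the module only at a high level. Below is a minimal PyTorch sketch of the idea, assuming both attention directions share one affinity matrix built from shared query/key projections; every name and hyperparameter here (MutualAttention, key_ch, the residual additions) is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MutualAttention(nn.Module):
    """Sketch: fuse a detail map and a semantic map in both directions
    through one shared affinity matrix (shared queries/keys)."""

    def __init__(self, detail_ch: int, semantic_ch: int, key_ch: int = 64):
        super().__init__()
        # 1x1 projections producing the shared query/key embeddings.
        self.q_detail = nn.Conv2d(detail_ch, key_ch, 1)
        self.k_semantic = nn.Conv2d(semantic_ch, key_ch, 1)
        # Value projections, mapped into the *other* branch's channel count.
        self.v_semantic = nn.Conv2d(semantic_ch, detail_ch, 1)
        self.v_detail = nn.Conv2d(detail_ch, semantic_ch, 1)

    def forward(self, detail, semantic):
        b, cd, h, w = detail.shape
        hs, ws = semantic.shape[2:]
        q = self.q_detail(detail).flatten(2).transpose(1, 2)      # B x Nd x Ck
        k = self.k_semantic(semantic).flatten(2)                  # B x Ck x Ns
        affinity = torch.bmm(q, k)                                # B x Nd x Ns

        # Direction 1: each detail point aggregates semantic features.
        vs = self.v_semantic(semantic).flatten(2).transpose(1, 2)  # B x Ns x Cd
        add_d = torch.bmm(F.softmax(affinity, dim=2), vs)          # B x Nd x Cd
        detail_out = detail + add_d.transpose(1, 2).reshape(b, cd, h, w)

        # Direction 2: reuse the transposed affinity for the reverse fusion.
        vd = self.v_detail(detail).flatten(2).transpose(1, 2)      # B x Nd x Cs
        add_s = torch.bmm(F.softmax(affinity.transpose(1, 2), dim=2), vd)
        semantic_out = semantic + add_s.transpose(1, 2).reshape(
            b, semantic.shape[1], hs, ws)

        return detail_out, semantic_out
```

Because detail_out and semantic_out keep the shapes of the inputs, a block like this could either replace an existing fusion module or be inserted between stages, which matches the "replacement" and "insertion" modes the experiments describe.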


Similar Resources

Semantic Segmentation with Reverse Attention

Recent development in fully convolutional neural networks enables efficient end-to-end learning of semantic segmentation. Traditionally, the convolutional classifiers are taught to learn the representative semantic features of labeled semantic objects. In this work, we propose a reverse attention network (RAN) architecture that trains the network to capture the opposite concept (i.e., what are n...
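The truncated sentence points at the reverse-branch idea: a second branch learns which regions do not belong to a class, and its negated response gates the direct prediction. The sketch below is only one plausible reading of that gating; the function name and combination rule are assumptions, not RAN's exact formulation.

```python
import torch

def reverse_attention_fuse(direct_logits: torch.Tensor,
                           reverse_logits: torch.Tensor) -> torch.Tensor:
    """Gate per-class responses by a reverse-attention map (illustrative).

    direct_logits, reverse_logits: B x num_classes x H x W score maps.
    The reverse branch is trained to respond to "not this class", so
    sigmoid(-reverse_logits) is high where there is no evidence against
    the class, reinforcing the direct prediction there.
    """
    reverse_attention = torch.sigmoid(-reverse_logits)
    return direct_logits + direct_logits * reverse_attention
```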


Image Segmentation Based on Visual Attention Mechanism

A new approach for image segmentation based on visual attention mechanism is proposed. Motivated biologically, this approach simulates the bottom-up human visual selective attention mechanism, extracts early vision features of the image and constructs the saliency map. Multiple image features such as intensity, color and orientation in multiple scales are extracted to get some feature maps. The...
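The description matches classic bottom-up saliency pipelines. Purely as an illustration, here is a tiny sketch of the multi-scale center-surround idea using only an intensity feature (the approach also uses color and orientation channels, omitted here); simple_saliency and its scales are hypothetical names, not the paper's code.

```python
import torch
import torch.nn.functional as F

def simple_saliency(image: torch.Tensor, scales=(2, 4, 8)) -> torch.Tensor:
    """Bottom-up saliency sketch: intensity center-surround differences
    at several scales, summed into one normalized map.

    image: B x 3 x H x W, values in [0, 1].
    """
    intensity = image.mean(dim=1, keepdim=True)  # crude intensity channel
    h, w = intensity.shape[2:]
    saliency = torch.zeros_like(intensity)
    for s in scales:
        coarse = F.avg_pool2d(intensity, kernel_size=s)  # surround at scale s
        surround = F.interpolate(coarse, size=(h, w), mode="bilinear",
                                 align_corners=False)
        saliency = saliency + (intensity - surround).abs()  # center-surround
    # Normalize to [0, 1] so all scales contribute comparably.
    return saliency / (saliency.amax(dim=(2, 3), keepdim=True) + 1e-8)
```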


Stacked Deconvolutional Network for Semantic Segmentation

Recent progress in semantic segmentation has been driven by improving the spatial resolution under Fully Convolutional Networks (FCNs). To address this problem, we propose a Stacked Deconvolutional Network (SDN) for semantic segmentation. In SDN, multiple shallow deconvolutional networks, which are called SDN units, are stacked one by one to integrate contextual information and guarantee the...
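As a rough illustration of the stacking idea, one SDN-like unit can be pictured as a shallow encoder-decoder, with several such units composed in sequence; ShallowEncDec below is a hypothetical stand-in (the real SDN units add dense inter-unit connections and supervision this sketch omits), assuming even spatial input sizes.

```python
import torch.nn as nn

class ShallowEncDec(nn.Module):
    """Illustrative stand-in for one stacked unit: downsample, transform,
    then deconvolve back to the input resolution."""
    def __init__(self, channels: int):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.decode(self.encode(x))  # residual keeps units composable

# "Stacked one by one": three units applied in sequence.
stacked = nn.Sequential(*[ShallowEncDec(64) for _ in range(3)])
```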


Improving Fully Convolution Network for Semantic Segmentation

Fully Convolution Networks (FCN) have achieved great success in dense prediction tasks including semantic segmentation. In this paper, we start from discussing FCN by understanding its architecture limitations in building a strong segmentation network. Next, we present our Improved Fully Convolution Network (IFCN). In contrast to FCN, IFCN introduces a context network that progressively expands...
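The truncated sentence refers to a context network that progressively expands the receptive field. One common way to realize such growth is a stack of 3x3 convolutions with increasing dilation, sketched below purely as an assumption (IFCN's actual context network also uses dense skip connections not shown here).

```python
import torch.nn as nn

def context_network(channels: int, dilations=(1, 2, 4, 8)) -> nn.Sequential:
    """A stack of dilated 3x3 convolutions whose receptive field grows
    with each layer, progressively expanding the visible context."""
    layers = []
    for d in dilations:
        layers += [nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)
```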


ShuffleSeg: Real-time Semantic Segmentation Network

Real-time semantic segmentation is of significant importance for mobile and robotics related applications. We propose a computationally efficient segmentation network which we term as ShuffleSeg. The proposed architecture is based on grouped convolution and channel shuffling in its encoder for improving the performance. An ablation study of different decoding methods is compared including Skip ...
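The channel-shuffle operation that ShuffleNet-style encoders place between grouped convolutions is standard and small enough to show directly; the sketch below is the usual reshape-transpose-reshape formulation, not ShuffleSeg's exact code.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Mix channels across groups so information flows between the
    otherwise isolated branches of grouped convolutions."""
    b, c, h, w = x.shape
    assert c % groups == 0, "channel count must divide evenly into groups"
    x = x.view(b, groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 2).contiguous()        # interleave the groups
    return x.view(b, c, h, w)                 # flatten back to B x C x H x W
```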



Journal

Journal title: Journal of Image and Graphics

Year: 2023

ISSN: 1006-8961

DOI: https://doi.org/10.11834/jig.211127